The Case for Byzantine Fault Detection
Authors
Abstract
Distributed systems are subject to a variety of failures and attacks. In this paper, we consider general (Byzantine) failures [11], in which a failed node may exhibit arbitrary behavior. In particular, a failed node may corrupt its local state, send random messages, or even send specific messages aimed at subverting the system. Many security attacks can be modeled as Byzantine failures, such as censorship, freeloading, misrouting, or data corruption.
Systems can be protected with Byzantine fault tolerance (BFT) techniques, which can mask a bounded number of Byzantine failures, for example by using state machine replication [4]. BFT is a powerful technique, but it has its costs. In a practical system that needs to tolerate up to f concurrent Byzantine failures, BFT cannot be implemented with fewer than 3f + 1 replicas [3]. Moreover, BFT scales poorly to large replica groups; as more servers are added, the throughput of the system may actually decrease [7].
In this paper, we explore an alternative approach that aims at detecting rather than masking faulty behavior. In this approach, the system makes no attempt to hide the symptoms of Byzantine faults. Rather, each node is equipped with a detector that monitors the other nodes for signs of faulty behavior. If the detector determines that another node has become faulty, it notifies the local node, which can then take appropriate action. For example, the local node can cease to communicate with the faulty node; once all correct nodes have followed suit, the faulty node is isolated and the fault is contained.
Specifically, we consider detection systems that are based on accountability [15]. With accountability, each action is associated with the identity of the node that has taken it, which allows the system to gather irrefutable evidence of faulty behavior. This has two important advantages: First, nodes can use the evidence to convince other nodes that a fault has occurred. Second, the evidence enables the system to resolve he-said-she-said situations in which two nodes accuse each other of having failed.
Our goals in this paper are threefold: First, we examine the trade-offs between fault detection and traditional BFT. Second, we give a precise definition of the class of Byzantine faults that can be detected with this approach. Finally, we give a brief sketch of a practical system that implements such a detector.
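To make the idea of irrefutable evidence concrete, the following is a minimal Python sketch, not the system described in the paper. It assumes that every message carries a sender identity, a sequence number, and an authenticator, and it treats two authentic but conflicting messages for the same sequence number as evidence of a fault. All names (`NODE_KEYS`, `SignedMessage`, `sign`, `is_authentic`, `is_evidence_of_equivocation`) are hypothetical, and HMAC with a per-node key is only a stand-in for the public-key signatures a real deployment would need, since with HMAC the verifier must also hold the secret key.

```python
import hashlib
import hmac
from dataclasses import dataclass

# ASSUMPTION: a hypothetical key registry. HMAC stands in for real digital
# signatures; with public-key signatures, any node could verify a message
# without being able to forge one.
NODE_KEYS = {"A": b"key-of-A", "B": b"key-of-B"}


@dataclass(frozen=True)
class SignedMessage:
    sender: str   # identity of the node that took the action
    seq: int      # sequence number of the action
    payload: str  # content of the action
    tag: str      # authenticator binding sender, seq, and payload


def _expected_tag(sender: str, seq: int, payload: str) -> str:
    """Compute the authenticator a genuine message from `sender` would carry."""
    data = repr((sender, seq, payload)).encode()
    return hmac.new(NODE_KEYS[sender], data, hashlib.sha256).hexdigest()


def sign(sender: str, seq: int, payload: str) -> SignedMessage:
    """Create a message whose authenticator ties it to `sender`."""
    return SignedMessage(sender, seq, payload, _expected_tag(sender, seq, payload))


def is_authentic(msg: SignedMessage) -> bool:
    """Check that the message really originates from its claimed sender."""
    if msg.sender not in NODE_KEYS:
        return False
    expected = _expected_tag(msg.sender, msg.seq, msg.payload)
    return hmac.compare_digest(msg.tag.encode(), expected.encode())


def is_evidence_of_equivocation(m1: SignedMessage, m2: SignedMessage) -> bool:
    """Two authentic messages from one sender with the same sequence number
    but different payloads show that the sender misbehaved."""
    return (
        is_authentic(m1)
        and is_authentic(m2)
        and m1.sender == m2.sender
        and m1.seq == m2.seq
        and m1.payload != m2.payload
    )


if __name__ == "__main__":
    first = sign("A", 7, "x=1")
    second = sign("A", 7, "x=2")  # conflicting statement for the same seq
    print(is_evidence_of_equivocation(first, second))
    framed = SignedMessage("A", 7, "x=3", "forged")  # an accusation without a valid tag
    print(is_evidence_of_equivocation(first, framed))
```

In this sketch, the pair of conflicting messages can be forwarded to any other node, which can re-run the same check instead of trusting the accuser. An accusation backed by a message with an invalid authenticator is rejected, which is one way a he-said-she-said dispute could be resolved.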
Similar resources
A Robust Byzantine Fault-Tolerant Replication Technique for Peer-to-Peer Content Distribution
Problem statement: In peer-to-peer networks, Byzantine fault tolerance refers to the capability of a system to tolerate Byzantine faults. It can be achieved by replicating the server and by ensuring all server replicas reach an agreement on the input despite Byzantine faulty replicas and clients. Since malicious attacks and software errors can cause faulty nodes to exhibit Byzantine behavior, B...
Byzantine Fault Isolation in the Farsite Distributed File System
In a peer-to-peer system of interacting Byzantine-fault-tolerant replicated-state-machine groups, as system scale increases, so does the probability that a group will manifest a fault. If no steps are taken to prevent faults from spreading among groups, a single fault can result in total system failure. To address this problem, we introduce Byzantine Fault Isolation (BFI), a technique that enab...
Fault-Tolerant Wireless Multihop Transmissions with Byzantine Failure Detection
Wireless multihop networks consist of numbers of wireless nodes. Hence, introduction of failure detection and recovery is mandatory. Until now, various failure detection and recovery methods such as route switch and multiple routes detection have been proposed based on an assumption with stop failure model. However, the assumption that failed wireless nodes never transmit any messages is too re...
Modeling and Verification of Leaders Agreement in the Intrusion-Tolerant Enclaves Using PVS
Enclaves is a group-oriented intrusion-tolerant protocol. Intrusion-tolerant protocols are cryptographic protocols that implement fault-tolerance techniques to achieve security despite possible intrusions at some parts of the system. Among the most tedious faults to handle in security are the so-called Byzantine faults, where insiders maliciously exhibit an arbitrary (possibly dishonest) behavi...